Online Determination of Value-Function Structure and Action-value Estimates for Reinforcement Learning in a Cognitive Architecture

نویسندگان

  • John E. Laird
  • Nate Derbinsky
چکیده

We describe how an agent can dynamically and incrementally determine the structure of a value function from background knowledge as a side effect of problem solving. The agent determines the value function as it performs the task, using background knowledge in novel situations to compute an expected value for decision making. That expected value becomes the initial estimate of the value function, and the features tested by the background knowledge form the structure of the value function. This approach is implemented in Soar, using its existing mechanisms, relying on its preference-based decision-making, impasse-driven subgoaling, explanation-based rulelearning (chunking), and reinforcement learning. We evaluate this approach on a multiplayer dice game in which three different types of background knowledge are used.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool of knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share a same dynamic, reward and action space. In other words, the agents are assumed t...

متن کامل

Cooperative Cognitive Agents and Reinforcement Learning in Pursuit Game

This paper illustrates how a self-organizing cognitive architecture, known as TD-FALCON, can learn to function and cooperate in a dynamic environment. TD-FALCON learns the value functions of the stateaction space estimated through a temporal difference (TD) method. The learned value functions are then used to determine the optimal actions based on an action selection policy. To tackle a multi-a...

متن کامل

Q-Networks for Binary Vector Actions

In this paper reinforcement learning with binary vector actions was investigated. We suggest an effective architecture of the neural networks for approximating an action-value function with binary vector actions. The proposed architecture approximates the action-value function by a linear function with respect to the action vector, but is still non-linear with respect to the state input. We sho...

متن کامل

Web pages ranking algorithm based on reinforcement learning and user feedback

The main challenge of a search engine is ranking web documents to provide the best response to a user`s query. Despite the huge number of the extracted results for user`s query, only a small number of the first results are examined by users; therefore, the insertion of the related results in the first ranks is of great importance. In this paper, a ranking algorithm based on the reinforcement le...

متن کامل

Geodesic Distance-based Kernel Construction for Gaussian Process Value Function Approximation

Finding accurate approximations to state and action value functions is essential in Reinforcement learning tasks on continuous Markov Decision Processes. Using Gaussian processes as function approximators we can simultaneously represent model confidence and generalize to unvisited states. To improve the accuracy of the value function approximation in this article I present a new method of const...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012